SPECIFICATION

System for encoding characters 

This invention relates to a unified and simplified system
for encoding the characters of the various writing systems
of the world. With this system, and with the help of
character-information processors (including computers,
teleprinters and so on), a universal system can be built
that easily processes information composed of all kinds of
characters in the world.

Up to now, the usual "standard keyboard" has been enough to
cope with English words, but "special keyboards" have been
needed to process information composed of other characters.
When information composed of several kinds of characters
needs processing at the same time, an additional, very large
"multi-kind-character keyboard" is necessary. Very many
kinds of characters exist in the world today, yet so far no
method or equipment is available for processing information
composed of all kinds of characters.

The inventor considers that all kinds of characters in the
world are, without exception, formed by different strokes.
Occidental words are formed by letters in linear order, and
the letters are constructed from different strokes. Oriental
characters, such as Chinese characters and Japanese kanas,
are 2-D graphs which are also formed by various strokes
from top to bottom and from left to right. Occidental words
appear to differ enormously from Oriental ones.
Nevertheless, in view of the fact that letters are also
made of strokes and are also 2-D graphs, alphabetic writing
is in fact all the same as Chinese characters.
A patent application named "System for Encoding Chinese
Characters" has previously been filed; the application
numbers are as follows:-

US Patent Application No. P321
UK Patent Application No. GB 2 100 899 A
West Germany Patent Application No. P3217307.5
Japan Patent Application No.
The invention is mainly characterized by regarding all kinds
of characters in the world as formed by different strokes,
grouping the strokes according to their shapes and writing
directions, allocating an Arabic number to each category of
stroke, and then encoding the strokes in a strictly
formulated order.
Detailed description of the invention:-
(1) Stroke Shape Codes:-
In this invention, the strokes which form all kinds of
characters in the world are classified according to their
shapes and directions. One may choose 4, 8, 16 or 32
categories. Fig. 1 shows a code table for 8 categories,
which describes the shapes, directions and names of the 8
categories; a digit code is allotted to each of the eight.
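
As a minimal sketch, the 8-category table can be held as a
mapping from stroke name to digit. The digit assignments
below follow the order given in Claim 2 (1, 2, 3, 4, 5, 6,
7, 0); Fig. 1 itself is not reproduced here, and the names
STROKE_CODES and encode_strokes are illustrative, not part
of the specification.

```python
# Stroke categories and digits in the order listed in Claim 2.
STROKE_CODES = {
    "horizontal": 1,
    "vertical": 2,
    "left-swing": 3,
    "right-swing": 4,
    "left-turn": 5,    # left (clockwise) turn
    "right-turn": 6,   # right (anticlockwise) turn
    "cross": 7,
    "square": 0,
}


def encode_strokes(strokes):
    """Return the digit code for stroke names already placed in encoding order."""
    return "".join(str(STROKE_CODES[name]) for name in strokes)


print(encode_strokes(["vertical", "horizontal"]))  # prints 21
```

The example sequence reproduces "21", which the notes on
Fig. 4 give as one possible code for L.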

If a 4-category classification is preferred, left-swing and
left(clockwise)-turn in Fig. 1 may be merged into one
category, and likewise right-swing and
right(anticlockwise)-turn. In addition, both cross and
square may be divided into two kinds of strokes, namely
horizontal and vertical (or left and right). The 8
categories are thus reduced to 4, so that finally only
four kinds of strokes are left: horizontal, vertical, left
and right. Each group is allotted a number code, i.e. 1, 2,
3, 0 (ref. Fig. 2).
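
The reduction from 8 to 4 categories might be sketched as
follows. The 4-category digits follow Claim 3. How a cross
or square is split is only loosely specified, so the default
split into one horizontal and one vertical stroke, in that
order, is an assumption made for illustration, as are all
names in the block.

```python
# 4-category digits as listed in Claim 3.
FOUR_CODES = {"horizontal": 1, "vertical": 2, "left": 3, "right": 0}

# Swings and turns merge into left/right, as the text describes.
MERGED = {
    "horizontal": "horizontal",
    "vertical": "vertical",
    "left-swing": "left",
    "left-turn": "left",
    "right-swing": "right",
    "right-turn": "right",
}

# ASSUMPTION: a cross or square is split into one horizontal and one
# vertical stroke, in that order; the specification fixes neither the
# count nor the order of the parts.
DEFAULT_SPLIT = ("horizontal", "vertical")


def reduce_to_four(stroke, split=DEFAULT_SPLIT):
    """Map one 8-category stroke name to a list of 4-category digits."""
    if stroke in ("cross", "square"):
        return [FOUR_CODES[part] for part in split]
    return [FOUR_CODES[MERGED[stroke]]]


print(reduce_to_four("left-turn"))  # prints [3]
print(reduce_to_four("cross"))      # prints [1, 2]
```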
If a 16-category or 32-category classification is chosen,
sub-divide left(clockwise)-turn, right(anticlockwise)-turn,
cross and square to get the desired classification.
A 4-, 8-, 16- or 32-category classification is preferable to
a 5-, 10- or 20-category classification because it adapts
more easily to the binary system. Computers and other
information processors widely use powers of two such as
4 (= 2^2), 8 (= 2^3), 16 (= 2^4) or 32 (= 2^5) to store and
process information both conveniently and economically.
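
The arithmetic behind this preference can be checked with a
short sketch. It counts the whole bits needed per stroke
code and the bit patterns left unused; the function names
are illustrative and not part of the specification.

```python
def bits_per_stroke(categories):
    """Smallest number of whole bits that can distinguish `categories` codes."""
    return max(1, (categories - 1).bit_length())


def unused_patterns(categories):
    """Bit patterns left over when `categories` codes are stored in whole bits."""
    return 2 ** bits_per_stroke(categories) - categories


for n in (4, 8, 16, 32, 5, 10, 20):
    print(n, bits_per_stroke(n), unused_patterns(n))
```

For 4, 8, 16 and 32 categories every bit pattern is used,
whereas 5, 10 and 20 categories leave 3, 6 and 12 patterns
unused respectively.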

Generally, the 8-category classification of Fig. 1 is the
one most commonly used.
(2) Rules for Encoding:-

In this invention strokes are encoded according to definite
rules.

There are usually two rules: one is the writing order rule
and the other is the positional order rule.

Writing order is also known as stroke order, i.e. the
habitual writing order of each kind of character. It varies
with the person who writes and with the words which are
written.
Positional order is defined by the different positions of
the strokes in a character (or a letter). This invention
prescribes the positional order as follows: first the
highest position, then the next highest, and so on down to
the lowest, which is also known as from top to bottom; then
the left side before the right side, which is also known as
from left to right.

When comparing two stroke positions, first observe the
initial points of the two and see which is higher, as shown
in Fig. 3(1), (2). If neither is higher than the other,
observe which one starts further to the left, as shown in
Fig. 3(3), (4). When the two initial points coincide,
compare which termination point is higher, as shown in
Fig. 3(5), (6). If the two termination points are at the
same height, see which one is further to the left, as shown
in Fig. 3(7), (8).
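
The four comparisons above amount to a lexicographic sort
key. The following is a minimal sketch under stated
assumptions: each stroke is described by the coordinates of
its initial and termination points, and y grows downward so
that a smaller y means a higher position. The Stroke type
and function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class Stroke(NamedTuple):
    name: str
    start: Tuple[float, float]  # (x, y) of the initial point
    end: Tuple[float, float]    # (x, y) of the termination point


def positional_key(stroke):
    # ASSUMPTION: y grows downward, so a smaller y means a higher position.
    (sx, sy), (ex, ey) = stroke.start, stroke.end
    # Higher start, then further-left start, then higher end, then further-left end.
    return (sy, sx, ey, ex)


def positional_order(strokes):
    """Return the strokes sorted from top to bottom, then left to right."""
    return sorted(strokes, key=positional_key)


# A capital T: both strokes start at the same height, so the one
# starting further left (the horizontal bar) comes first.
t_strokes = [
    Stroke("vertical", (5, 0), (5, 10)),
    Stroke("horizontal", (0, 0), (10, 0)),
]
print([s.name for s in positional_order(t_strokes)])  # prints ['horizontal', 'vertical']
```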

Although encoding in writing order may be consistent with
the writing habit of a nation, ambiguity would arise due to
the fact that this order varies with the person who writes
and with the words which are written, thus resulting in a
less strict and less unified encoding system. It may appear
to be encoded.
Graphic order (that is, positional order) is a general rule
refined from the writing order rule. It is more scientific
and enables unification of the system for encoding all kinds
of characters in the world. At the same time, it guarantees
that each code has a unique meaning. For this reason,
graphic order is preferable in encoding, and it results in
both convenience and simplicity in use.

All kinds of characters in the world can be 
encoded by the above-mentioned method. 

Figs. 4-13 show how to encode English, French, German,
Romanian, Italian, Portuguese, Spanish, Norwegian (and
Danish), Esperanto, Swedish, Russian, Vietnamese, Japanese
kanas, Serbian, Mongolian, and Chinese characters.

This system is also suitable for Korean Hangul, Arabic or
the characters of any other nation.

Of course, a character may be given two number codes
wherever its graphic order differs from its writing order,
i.e. two different codes refer to the same character, which
provides one more means of reference. Thus one can find the
exact character (or letter) by either rule.

(3) Keyboard for Encoding:-
The existing standard keyboard can be used as the encoding
keyboard for the present invention. One could also,
however, have, say, 4, 8, 16 or 32 keys; the number of keys
varies with how many categories of stroke shapes are
adopted. Owing to the simplicity and convenience of this
system, it is possible to miniaturize the complicated
keyboard otherwise needed and greatly reduce the volume of
ancillary apparatus.

Punctuation marks can also be encoded by grouping stroke
shapes, e.g. the colon ":" consists of two points, and its
digit code is "44"; the question mark "?" has a
left(clockwise)-turn and a point, and the code "54" stands
for it, and so on. This avoids the need for a special
punctuation-mark keyboard in the apparatus.
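
The two punctuation examples can be reproduced with a small
sketch. Placing a point in category 4 is an inference from
the colon's code "44", and the digit 5 for the
left(clockwise)-turn follows Claim 2; the table and function
names are illustrative only.

```python
# Digits for the two stroke kinds used in the text's punctuation examples.
# ASSUMPTION: a point falls in category 4, inferred from the colon's code "44".
PUNCT_STROKE_CODES = {"point": 4, "left-turn": 5}

# Stroke composition of each mark, as described in the text.
PUNCTUATION_STROKES = {
    ":": ["point", "point"],
    "?": ["left-turn", "point"],
}


def encode_punctuation(mark):
    """Return the digit code of a punctuation mark from its strokes."""
    return "".join(str(PUNCT_STROKE_CODES[s]) for s in PUNCTUATION_STROKES[mark])


print(encode_punctuation(":"))  # prints 44
print(encode_punctuation("?"))  # prints 54
```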
The great value of the reduced keyboard lies in the
corresponding reduction it enables in the computer, which
can be reduced to the size of a calculator and used in
combination with a teleprinter and dial telephone to
transmit information composed of all kinds of characters in
the world.
The drawings:-

Figure 1, Table of 8-category system for encoding stroke
shapes.

Figure 2, Table of 4-category system for encoding stroke
shapes.

Figure 3, Graphic order rule based on 8-category system.

Figure 4, Digit codes for capital English letters.

Figure 5, Digit codes for small English letters.

Figure 6, Digit codes for complementary French, German and
Romanian letters.

Figure 7, Digit codes for complementary Italian, Portuguese
letters.

Figure 8, Digit codes for complementary Spanish, Norwegian
(including Danish), Esperanto and Swedish letters.

Figure 9, Digit codes for Russian letters.

Figure 10, Digit codes for complementary Vietnamese letters.

Figure 11, Digit codes for Japanese kanas.

Figure 12, Digit codes for complementary Japanese kanas and
Serbian (official Yugoslavian language) letters.

Figure 13, Digit codes for Chinese characters.

Description of the Drawings:- 
In Fig. 4:

(1) Z may be given the code "16" or "6". Since the latter is
identical with the code for C, there would be confusion if
the code "6" were used for both. Therefore an "*" mark is
added to tell one from the other. Hereafter, an "*" mark
appears wherever a code stands for different characters.

(2) L may be given the code "21" or "6"; for the reason
given above, it is encoded as 6**.

(3) M would be encoded in full as "2432". Since the shorter
code "243" is already effective, that is, it distinguishes
M from all other codes, the last numeral 2 is useless and
is dropped.

(4) Y may be encoded as "433" or "43"; the latter is
identical with the code for V, so a mark is added here.

(5) W would be encoded in full as "4343"; since "434" is
already effective, the last numeral 3 is dropped.
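
The marking convention in notes (1) and (2) can be sketched
as follows: each time a code is reused by a later character,
one more "*" is appended. The codes come from the notes
above; the processing order C, Z, L and the function name
are assumptions for illustration.

```python
from collections import defaultdict


def mark_collisions(entries):
    """Append one more "*" each time a code is reused by a later character."""
    seen = defaultdict(int)  # how many characters have already taken each code
    marked = {}
    for char, code in entries:
        marked[char] = code + "*" * seen[code]
        seen[code] += 1
    return marked


# ASSUMPTION: characters are processed in the order C, Z, L.
print(mark_collisions([("C", "6"), ("Z", "6"), ("L", "6")]))
# prints {'C': '6', 'Z': '6*', 'L': '6**'}
```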

In Fig. 5:

(1) H may be encoded as "226" or "26". Since the letter N
is expected to take "26" as its code, the two letters would
have the same code if a "#" mark were not added.
(2) D may be encoded as "260" or "26"; a mark is added to
"26".
In Fig. 9: 

3 may be encoded as "5" or "55". The latter has already been
taken, thus a mark is added to it.
In Fig. 10:

(1) ^ may be encoded as "5660" or "560"; the latter is
identical with another code, so an "*" mark is added here.

(2) i may be encoded as "5606" or "506"; the latter is
identical with another code, so a mark is added here.

(3) I may be encoded as "50" or "560"; the latter is
identical with another code, so a mark is added here.

In Fig. 11:

(1) ^ may be encoded as "51", which is identical with other
codes; an "*" mark is added here.

(2) D may be encoded as "51", which is identical with other
codes; a mark is added here.

In Fig. 12:

(1) 7 may be encoded as "54", which is identical with other
codes; a "#" mark is added here.

(2) 7, may be encoded as "54", which is identical with other
codes; a mark is added here.
CLAIMS 

1. A method of encoding all kinds of characters in the world
in digits, in which the key point is to identify the
individual strokes, in 2-D directions, of all types of
characters the world over, including Occidental alphabetic
writing in 1-D linear arrangement and Oriental ideographic
characters in 2-D arrangement (such as Chinese characters),
and to allocate a digit to each specific category of stroke
shape, thus making it possible to encode all kinds of
characters in the world systematically.

2. A method as claimed in Claim 1 in which the strokes of
all kinds of characters in the world are grouped into 8
basic categories of stroke shapes according to their
respective characteristics, namely horizontal, vertical,
left-swing, right-swing, left(clockwise)-turn,
right(anticlockwise)-turn, cross and square, which are
allotted the digit codes 1, 2, 3, 4, 5, 6, 7, 0
(referring to Fig. 1).

3. A method as claimed in Claim 1 in which the strokes of
all kinds of characters in the world are grouped into 4
basic categories of stroke shapes according to their
respective characteristics, namely horizontal, vertical,
left (including left-swing and left(clockwise)-turn) and
right (including right-swing and
right(anticlockwise)-turn), which are allotted the digit
codes 1, 2, 3, 0.

4. A method as claimed in Claim 1 in which the strokes of
all kinds of characters in the world can be grouped into 16
or 32 basic categories of stroke shapes according to their
respective characteristics and be allotted digit codes
accordingly.

5. A method as claimed in all preceding Claims in which the
stroke shapes are encoded in graphic (positional) order,
namely, first the highest position, then the next highest
position, and so on down to the lowest, which is also known
as from top to bottom; then the left side before the right
side, which is also known as from left to right.

6. A method as claimed in Claims 1-4 in which the stroke
shapes are encoded in writing order, namely, characters of
various nations are encoded according to their respective
writing habits.

7. A method as claimed in any of the preceding Claims in
which a letter or simple ideographic character is allotted
1-3 digits, an ordinary ideographic character is allotted 6
digits, and no more than 8 digits are used for the most
complicated ideographic characters.

8. A method as claimed in Claims 1-6 in which all kinds of
punctuation marks can be encoded by the system for encoding
stroke shapes.

9. All kinds of dictionaries, codes and character indexes in
which characters of various nations are encoded
systematically by a method as claimed in any of the
preceding Claims.

10. All systems for processing character information
(including computers, teleprinters, typewriters, dial
telephones, etc.), irrespective of whether the apparatus is
large or super-mini, in which all kinds of characters in the
world are encoded systematically by a method as claimed in
any of the preceding Claims, and a one-machine system which
can process one, two, three, up to all kinds of characters
in the world by a method as claimed in any of the preceding
Claims.

Printed in the United Kingdom for Her Majesty's Stationery
Office, Dd 8818935, 1986, 4235.
Published at The Patent Office, 25 Southampton Buildings,
London, WC2A 1AY, from which copies may be obtained.



